Goto

Collaborating Authors

 dataset-aware mixture-of-expert


DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets Supplementary Material

Neural Information Processing Systems

Chen et al. [2022] proves that in such a binary-classification problem, an MoE layer converges to an Proof: We will prove this by contradiction. Thus, we show that vanilla MoE does not guarantee convergence with mixture of datasets.


DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets Supplementary Material

Neural Information Processing Systems

Chen et al. [2022] proves that in such a binary-classification problem, an MoE layer converges to an Proof: We will prove this by contradiction. Thus, we show that vanilla MoE does not guarantee convergence with mixture of datasets.